SIMD Types Example: Matrix Multiplication [N4454]
Abstract
This document describes one possible implementation of a matrix class and matrix multiplication using the data-parallel SIMD types introduced in [N4184]. The example shows the basic use of SIMD types for manual transformation of a loop over scalars to a loop with increased stride using SIMD vector loads and stores and SIMD operations in the loop body.
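The loop transformation described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation from the paper: `FloatV` is a hypothetical four-lane stand-in written here with plain arrays, and it is not the interface proposed in [N4184]. The function names, the row-major `std::vector<float>` storage, and the requirement that the matrix dimension be a multiple of the vector width are likewise assumptions made for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a data-parallel SIMD type (NOT the N4184 API).
// A real SIMD type would map these operations to vector instructions.
struct FloatV {
    static constexpr std::size_t size = 4;  // assumed number of lanes
    std::array<float, size> lane{};

    FloatV() = default;
    explicit FloatV(float x) { lane.fill(x); }  // broadcast a scalar to all lanes

    // Vector load: read `size` consecutive floats starting at p.
    static FloatV load(const float* p) {
        FloatV v;
        for (std::size_t i = 0; i < size; ++i) v.lane[i] = p[i];
        return v;
    }
    // Vector store: write all lanes to `size` consecutive floats at p.
    void store(float* p) const {
        for (std::size_t i = 0; i < size; ++i) p[i] = lane[i];
    }
    friend FloatV operator+(const FloatV& a, const FloatV& b) {
        FloatV r;
        for (std::size_t i = 0; i < size; ++i) r.lane[i] = a.lane[i] + b.lane[i];
        return r;
    }
    friend FloatV operator*(const FloatV& a, const FloatV& b) {
        FloatV r;
        for (std::size_t i = 0; i < size; ++i) r.lane[i] = a.lane[i] * b.lane[i];
        return r;
    }
};

// Scalar reference: C = A * B for n x n row-major matrices.
void matmul_scalar(const std::vector<float>& a, const std::vector<float>& b,
                   std::vector<float>& c, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (std::size_t k = 0; k < n; ++k) sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
}

// Same computation with the j loop strided by the vector width:
// each iteration produces FloatV::size adjacent elements of one row of C.
void matmul_simd(const std::vector<float>& a, const std::vector<float>& b,
                 std::vector<float>& c, std::size_t n) {
    assert(n % FloatV::size == 0);  // assumption: no remainder loop needed
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; j += FloatV::size) {
            FloatV sum(0.0f);
            for (std::size_t k = 0; k < n; ++k) {
                // Broadcast A(i,k), multiply by a vector loaded from row k of B.
                sum = sum + FloatV(a[i * n + k]) * FloatV::load(&b[k * n + j]);
            }
            sum.store(&c[i * n + j]);  // vector store into row i of C
        }
}
```

Calling `matmul_scalar` and `matmul_simd` on the same inputs should give matching results, since both accumulate over `k` in the same order. The sketch vectorizes along the columns of the result so that loads from B and stores to C touch contiguous memory; other layouts are possible.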
Related resources
Breaking the performance bottleneck of sparse matrix-vector multiplication on SIMD processors
The low utilization of SIMD units and memory bandwidth is the main performance bottleneck on SIMD processors for sparse matrix-vector multiplication (SpMV), which is one of the most important kernels in many scientific and engineering applications. This paper proposes a hybrid optimization method to break the performance bottleneck of SpMV on SIMD processors. The method includes a new sparse ma...
Segmented Arithmetic Operators for Graphics Processing
Graphics processing relies on executing similar instructions repeatedly on a large data set. This parallelism in the data gives rise to the Single-Instruction Multiple-Data (SIMD) paradigm which is used in modern processors. This paper explores several techniques that exploit the parallelism in the SIMD execution functional units and proposes several new SIMD methods. The methods discussed in t...
Large Matrix Multiplication on a Novel Heterogeneous Parallel DSP Architecture
This paper introduces a novel master-multi-SIMD on-chip multi-core architecture for embedded signal processing. The parallel architecture and its memory subsystem are described in this paper. We evaluate the large size matrix multiplication performance on this parallel architecture and compare it with a SIMD-extended data parallel architecture. We also examine how well the new architecture scal...
Experimental Evaluation of Affine Schedules for Matrix Multiplication on the MasPar Architecture
This paper reports an experimental study on the suitability of systolic algorithms scheduling methods to the automatic parallelization of algorithms on SIMD computers. We consider the matrix multiplication on the MasPar MP-1 architecture. We comparatively study different scheduling methods and the blocking of the best resulting algorithms.
Publication date: 2015